ANNOTATED Swadesh wordlists for the Tsezic group (North Caucasian family).

Languages included: Hunzib (proper) [tsz-huz], Bezhta proper [tsz-bez], Khoshar-Khota Bezhta [tsz-bek], Tlyadal Bezhta [tsz-bet], Hinukh [tsz-gin], Kidero Dido [tsz-ddo], Sagada Dido [tsz-dds], Khwarshi proper [tsz-khv], Inkhokwari Khwarshi [tsz-khi].

Data sources.

General:
Bokarev 1959 = Е. А. Бокарев. Цезские (дидойские) языки Дагестана. Москва, 1959. // Individual grammar sketches of the Tsezic languages.

Comrie & Khalilov 2010 = Б. Комри, М. Халилов. Словарь языков и диалектов народов Северного Кавказа. Сопоставление основной лексики. Лейпциг/Махачкала, 2010 [B. Comrie, M. Khalilov. The dictionary of languages and dialects of the peoples of the Northern Caucasus. Comparison of the basic lexicon. Leipzig/Makhachkala, 2010] // A thematic glossary of East Caucasian and some neighbouring languages. See http://lingweb.eva.mpg.de/ids/ for online access and details. The source is somewhat unreliable and contains a considerable number of erroneous forms.

Dirr 1909 = А. М. Дирр. Материалы для изучения языков и наречий андо-дидойской группы. In: Сборник материалов для описания местностей и племен Кавказа, вып. 40. // Short grammar sketches and a comparative glossary of some Andian and Tsezic languages.

Kibrik & Kodzasov 1988 = А. Е. Кибрик, С. В. Кодзасов. Сопоставительное изучение дагестанских языков: Глагол. Москва, 1988. // A thematic glossary of verbs of East Caucasian languages. Supplemented by short sketches of the verb system of individual languages.

Kibrik & Kodzasov 1990 = А. Е. Кибрик, С. В. Кодзасов. Сопоставительное изучение дагестанских языков: Имя, фонетика. Москва, 1990. // A thematic glossary of nouns of East Caucasian languages. Supplemented by short sketches of the phonetic and nominal systems of individual languages.

Koryakov 2006 = Ю. Б. Коряков. Атлас кавказских языков. С приложением полного реестра языков. Москва, 2006. // Detailed color maps of the modern areas of North East Caucasian, North West Caucasian and Kartvelian (South Caucasian) languages with excursuses into history.

NCED = S. L. Nikolayev, S. A. Starostin. A North Caucasian Etymological Dictionary. Moscow: Asterisk Publishers, 1994. Reprint in 3 vols.: Ann Arbor: Caravan Books, 2007. // Monumental etymological dictionary of the North Caucasian (Nakh-Daghestanian, a.k.a. Northeast Caucasian + Abkhaz-Adyghe, a.k.a. Northwest Caucasian) language family. In addition to approximately 2000 roots, reliably or tentatively reconstructed for Proto-North Caucasian, also provides intermediate reconstructions for the protolanguages of the daughter branches: Nakh, Avar-Andian, Tsezian, Dargwa, Lezgian, Abkhaz-Adyghe. Tables of correspondences and detailed notes are given in the introduction, available online at http://starling.rinet.ru/Texts/caucpref.pdf. All etymologies also available online on the StarLing database server, at http://starling.rinet.ru/cgi-bin/main.cgi?flags=eygtnnl.

TsezEDb = S. A. Starostin. Tsezian Etymological Database. // Computerized version of the Proto-Tsezic corpus, available at http://starling.rinet.ru/cgi-bin/main.cgi?flags=eygtnnl. Includes some Proto-Tsezic etymologies (mostly basic lexicon items) that have not been included in [NCED] due to their lack of external cognates in other branches of North Caucasian.

I. Hunzib (proper):

Bokarev 1961 = Е. А. Бокарев. Материалы к словарю гунзибского языка. In: Вопросы изучения иберийско-кавказских языков. Москва, 1961. P. 147-182. // Hunzib-Russian and Russian-Hunzib glossary, based on the Hunzib proper dialect.

Isakov & Khalilov 2001 = И. А. Исаков, М. Ш. Халилов. Гунзибско-русский словарь. Москва, 2001. // Hunzib-Russian dictionary based on the Hunzib proper dialect with specific Garbutli and Naxada words quoted. Ca. 7000 entries. Supplemented with a grammar sketch.

Isakov & Khalilov 2012 = И. А. Исаков, М. Ш. Халилов. Гунзибский язык. Фонетика, морфология, словообразование, лексика, тексты. Махачкала, 2012. // Descriptive grammar of the Hunzib language, based on the Hunzib proper dialect.

van den Berg 1995 = H. van den Berg. A grammar of Hunzib. With texts and lexicon. Leiden, 1995 (also published under the same title as: München & Newcastle: Lincom Europe, 1995). // Descriptive grammar of the Hunzib language, based on the Hunzib proper dialect. Supplemented with texts and a glossary.

II. Bezhta (Bezhta proper, Khoshar-Khota, Tlyadal)

Khalilov 1995 = М. Ш. Халилов. Бежтинско-русский словарь. Махачкала, 1995. // A Bezhta-Russian dictionary (ca. 7000 entries), based on the Bezhta proper dialect with specific words from other dialects quoted. Supplemented with a Russian-Bezhta index and a grammar sketch.

Kibrik & Testelets 2004 = A. E. Kibrik & Ya. G. Testelets. Bezhta. In: M. Job (ed.). The Indigenous Languages of the Caucasus, vol. 3, part 1. Caravan Books, 2004. P. 217-295. // A grammar sketch of Bezhta, based on the Tlyadal dialect.

Madieva 1965 = Г. И. Мадиева. Грамматический очерк бежтинского языка (по данным говора с. Бежта). Махачкала, 1965. // A descriptive grammar of the Bezhta language, based on the Bezhta proper dialect. Supplemented with a glossary. Transcription of the Bezhta forms is not very reliable.

III. Hinukh

Forker 2013 = D. Forker. A Grammar of Hinuq. Berlin/Boston: De Gruyter Mouton, 2013. // Descriptive grammar of the Hinukh language.

Isakov & Khalilov 2004 = I. A. Isakov & M. Š. Xalilov. Hinukh. In: M. Job (ed.). The Indigenous Languages of the Caucasus, vol. 3, part 1. Caravan Books, 2004. P. 167-214. // A grammar sketch of the Hinukh language.

Khalilov & Isakov 2005 = М. Ш. Халилов, И. А. Исаков. Гинухско-русский словарь. Махачкала, 2005. // Hinukh-Russian dictionary, ca. 7500 entries. Supplemented with a grammar sketch.

Lomtadze 1963 = Э. А. Ломтадзе. Гинухский диалект дидойского языка. Тбилиси, 1963. // Descriptive grammar of the Hinukh language.

IV. Dido (Kidero, Sagada)

Main sources
Abdulaev 2014 = Unpublished field records of the Sagada dialect of Dido by Arsen Abdulaev, February 2014.

Alekseev & Radzhabov 2004 = M. E. Alekseev & R. N. Radžabov. Tsez. In: M. Job (ed.). The Indigenous Languages of the Caucasus, vol. 3, part 1. Caravan Books, 2004. P. 115-163. // A grammar sketch of the Dido language, based mostly on the Kidero dialect.

Imnaishvili 1963 = Д. С. Имнайшвили. Дидойский язык в сравнении с гинухским и хваршийским языками. Тбилиси, 1963. // Descriptive grammar of the West Tsezic languages: Dido, Hinukh, Khvarshi.

Khalilov 1999 = М. Ш. Халилов. Цезско-русский словарь. Москва, 1999. // Dido-Russian dictionary based on the Kidero dialect with specific forms of other dialects sporadically quoted. Ca. 7500 entries. Supplemented with a grammar sketch.

Additional sources
Comrie 2007 = B. Comrie. Tsez (Dido) Morphology. In: Alan S. Kaye (ed.). Morphologies of Asia and Africa. Eisenbrauns, 2007. P. 1193-1204.

Comrie et al. 1998 = B. Comrie, M. Polinsky, R. Rajabov. Tsezian languages. Unpubl. ms, 1998, available at: http://scholar.harvard.edu/mpolinsky/publications

Maddieson et al. 1996 = I. Maddieson, R. Rajabov, A. Sonnenschein. The main features of Tsez phonetics. In: UCLA WPP 93: Fieldwork Studies of Targeted Languages IV. 1996. P. 94-110.

V. Khwarshi (proper, Inkhokwari)

Karimova 2014 = Unpublished field records of three Khwarshi dialects (Khwarshi proper, Inkhokwari, Kwantlada) by Raisat Karimova, February 2014.

Khalilova 2009 = Z. Khalilova. A Grammar of Khwarshi. Proefschrift ter verkrijging van de graad van Doctor aan de Universiteit Leiden, 17 december 2009. // Descriptive grammar of the Kwantlada dialect of the Khwarshi language.

Sharafutdinova & Levina 1961 = Р. Шарафутдинова, Р. Левина. Хваршинский язык (предварительное сообщение). In: Вопросы изучения иберийско-кавказских языков. Москва, 1961. P. 89-122. // Grammar sketch of Khwarshi proper.

NOTES

I. Hunzib (proper)

I.1. General.

The Hunzib (Gunzib) language consists of three dialects: Hunzib proper, Garbutli, Naxada. The three are very close to each other, see [van den Berg 1995: 348 f.; Isakov & Khalilov 2012: 337 ff.] for the main discrepancies. It seems, however, that there are several mismatches between Hunzib proper and Naxada or Garbutli within the 110-item wordlist: see ‘mouth’, ‘red’, ‘yellow’ and possibly ‘to come’.

The available linguistic sources are based on Hunzib proper. The primary lexicographic source for Hunzib proper is the dictionary [Isakov & Khalilov 2001], plus the glossaries in [Kibrik & Kodzasov 1990; van den Berg 1995; Bokarev 1961]. Some forms and grammatical information have been taken from [Isakov & Khalilov 2012; van den Berg 1995; Bokarev 1959: 14-65]. Hunzib lexical data are systematically quoted in [Comrie & Khalilov 2010], but we prefer not to use this source due to its general unreliability.

I.2. Transliteration.

The following transliterational chart covers our principal sources:

[Isakov & Khalilov 2001] [van den Berg 1995] [Kibrik & Kodzasov 1990] [NCED] GLD
б b b b b
п p p p p
пI
д d d d d
т t t t t
тI
ц c c c c
цI
з z z z z
с s s s s
ч č č č č
чI čʼ čʼ č̣ čʼ
ж ž ž ž ž
ш š š š š
лI ƛ L ƛ ƛ
кь ƛʼ ƛʼ ƛʼ
лъ λ ɬ λ ɬ
г g g g g
к k k k k
кI
хь x x x
хъ q q q q
къ
гъ R ʁ ʁ
х x X χ χ
гI ʕ ʕ ʕ ʕ
хI ħ H ħ ħ
ъ ʔ ʔ ʔ ʔ
гь h h h h
м m m m m
н n n n n
р r r r r
л l l l l
в w w w w
й j j j y
CC CC CC CC
и i i i i
е, э e e e e
а а a a a
о о o o o
у u u u u
а̇ α ɵ ɔ ɑ
ǝ ǝ ǝ ǝ ǝ
ы ɨ ɨ ɨ ɨ
Vн
VV
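As an illustration of how the chart can be applied, the following minimal Python sketch converts a form from the [Kibrik & Kodzasov 1990] notation to the GLD notation. It is only a sketch under stated assumptions: the mapping lists only those rows of the chart in which all columns are filled and the two notations differ, the function name kk1990_to_gld is hypothetical, and the sample forms are made-up strings rather than attested Hunzib words.

```python
# Kibrik & Kodzasov 1990 symbol -> GLD symbol, taken from the Hunzib chart.
# Only rows with all columns filled in the chart are listed; symbols that
# are identical in both notations need no entry.
KK1990_TO_GLD = {
    "L": "ƛ",  # row лI
    "X": "χ",  # row х
    "H": "ħ",  # row хI
    "j": "y",  # row й
    "ɵ": "ɑ",  # row а̇
}

# Every key is a single character, so a translation table is sufficient.
_TABLE = str.maketrans(KK1990_TO_GLD)


def kk1990_to_gld(form: str) -> str:
    """Convert a form from the Kibrik & Kodzasov 1990 notation to GLD."""
    return form.translate(_TABLE)


if __name__ == "__main__":
    # Made-up strings, used only to exercise the mapping.
    for sample in ("LaXuj", "Hɵ"):
        print(sample, "->", kk1990_to_gld(sample))
```

A per-character table works here because every differing symbol is a single character; rows with empty cells in the chart would have to be filled in from the sources before they could be added.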

1. We treat geminated consonants as single long units (e.g., cʼː sː lː and so on) rather than as the bi-phonemic clusters (tt cʼcʼ ss ll and so on) found in other sources. The geminates only occur in the intervocalic position in adjectives and more rarely in adverbs [van den Berg 1995: 25]; these originate from the consonant clusters with the adjective suffix -y-. Note that the available sources are rather inconsistent in their transcription of geminates and plain variants.

2. Voiceless stops and affricates (t, č and so on) are actually aspirated (tʰ, čʰ and so on).

3. Velar x and pharyngeal ʕ ħ are restricted to loanwords.

4. The vowels ɑ ǝ ɨ are retracted, see [Kibrik & Kodzasov 1990: 332] for details.

5. Nasalized vowels tend to denasalize in the speech of younger generations.
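The treatment of geminates in note 1 can be sketched in Python as follows. This is a hedged illustration, not the procedure actually used for the database: the consonant inventory in the pattern is a simplified assumption, the function name merge_geminates is hypothetical, and the sample strings are made up.

```python
import re

# A base consonant with an optional ejective mark. The inventory is a
# simplified assumption and does not reproduce the full chart.
_CONSONANT = r"[bpdtczsčžšƛɬgkqʁχmnrlwy]ʼ?"

# The same consonant (including its ejective mark) written twice in a row.
_GEMINATE = re.compile(rf"({_CONSONANT})\1")


def merge_geminates(form: str) -> str:
    """Rewrite a bi-phonemic cluster such as tt or cʼcʼ as tː or cʼː."""
    return _GEMINATE.sub(lambda m: m.group(1) + "ː", form)


if __name__ == "__main__":
    # Made-up strings, used only to exercise the rule.
    for sample in ("assa", "acʼcʼa", "atʼta"):
        print(sample, "->", merge_geminates(sample))
```

A sequence such as tʼt is left untouched, since its two members are not identical.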

II. Bezhta (Bezhta proper, Khoshar-Khota, Tlyadal)

II.1. General.

The Bezhta language consists of three dialects: Bezhta proper, Khoshar-Khota (Khocharkhotin), Tlyadal (Tlyadaly, Tliadal). All three are quite close to each other.

The primary sources on Bezhta proper are the dictionary [Khalilov 1995] and the grammar [Madieva 1965]. Khoshar-Khota Bezhta vocabulary is available in [Kibrik & Kodzasov 1988; Kibrik & Kodzasov 1990]. The main sources on Tlyadal Bezhta are the glossaries [Kibrik & Kodzasov 1988; Kibrik & Kodzasov 1990] and two grammar sketches: [Bokarev 1959: 66-109; Kibrik & Testelets 2004].

Bezhta lexical data are systematically quoted in [Comrie & Khalilov 2010]: Bezhta proper, Khoshar-Khota, Tlyadal and additionally the Karauzek sub-dialect of Tlyadal. Since Madzhid Khalilov is a Bezhta native speaker, we sometimes resort to [Comrie & Khalilov 2010] as an additional source, despite the general unreliability of this dictionary.

The following slots remain empty due to scarcity of available linguistic documentation. Khoshar-Khota & Tlyadal: ‘person’, ‘to swim’; only Khoshar-Khota: ‘all’, ‘good’, ‘not’, ‘that’, ‘this’.

II.2. Transliteration.

The following transliterational chart covers our principal sources:

[Khalilov 1995] [Kibrik & Kodzasov 1990] [NCED] GLD
б b b b
п p p p
пI
д d d d
т t t t
тI
ц c c c
цI
з z z z
с s s s
ч č č č
чI čʼ č̣ čʼ
ж ž ž ž
ш š š š
лI L ƛ ƛ
кь ƛʼ ƛʼ
лъ ɬ λ ɬ
г g g g
к k k k
кI
хъ q q q
къ
гъ R ʁ ʁ
х X χ χ
гI ʕ ʕ ʕ
хI H ħ ħ
ъ ʔ ʔ ʔ
гь h h h
м m m m
н n n n
р r r r
л l l l
в w w w
й j j y
CC CC CC
и i i i
е, э e e e
а a a a
аь ä ä ä
о o o o
оь ö ö ö
у u u u
уь ü ü ü
Vн

1. We treat geminated consonants as single long units (e.g., tʼː cː šː and so on) instead of bi-phonemic clusters (tʼtʼ cc šš and so on) as is done in other sources. In most cases, these geminates originate from contraction with suffixal y.

2. Voiceless stops and affricates (t, č and so on) are slightly aspirated (tʰ, čʰ and so on).

3. In [Kibrik & Testelets 2004], the additional front vowel æ is listed as phonologically opposed to e and ä; all other sources, including our transcription, do not distinguish between ä and æ.

4. Kodzasov describes the Tlyadal prosodic system as tonal [Kibrik & Kodzasov 1990: 331] with several tones distinguished. Three are basic register tones: low ˨, mid ˧, high ˦. Four more are rare contour tones: low-mid ˩˧, mid-high ˧˥, high-mid ˥˧, mid-low ˧˩. These tones are only marked for Tlyadal in [Kibrik & Kodzasov 1990] (and also mentioned in [Kibrik & Testelets 2004: 220 f.]), and, strictly speaking, it is not entirely clear whether the Tlyadal prosodic oppositions are indeed tonal or not. We do not quote the tonal transcription.

5. As reported by Kodzasov, in the Khoshar-Khota dialect, ä ö ü are epiglottalized vowels, causing automatic assimilation of adjacent ʔ h > ʢ ʜ [Kibrik & Kodzasov 1990: 331]. Apparently the same is true for the Bezhta proper dialect, where, in addition, this assimilation also affects the uvulars (q qʼ ʁ χ). In [Khalilov 1995: 390], ä ö ü in combination with the uvulars (q qʼ ʁ χ) and the glottal stop (ʔ) are described as pharyngealized and sporadically (not always!) transcribed in Cyrillic orthography with the additional signs {ʻ} for consonants and {I} for vowels: e.g., qöqilö ‘rough, coarse, rude’ is quoted as {хъоьхъилоь} in [Khalilov 1995: 264], but as {хъʻоIхъило} in [Khalilov 1995: 390]. Ya. Testelets (p.c.) has pointed out, however, that historically an epiglottal or pharyngeal prosody or the pharyngeal fricatives should be primary in all these cases; in particular, ä ö ü secondarily originate from a o u in such a pharyngeal context. When Khalilov's notation with {ʻ} or {I} is available, we quote the transcription with pharyngealization ˤ in the notes.

6. Following common practice, we do not note the initial glottal stop (ʔ), which has the status of an automatic prothesis in the case of vocalic onset [Kibrik & Testelets 2004: 221]. It should be noted that in the Bezhta proper dictionary [Khalilov 1995], vocalic onset can be explicitly written as {ъV-} (not simple {V-}); this is usual for initial front vowels or onomatopoeic forms.

7. For vowel and consonant harmony, see [Kibrik & Testelets 2004: 221 ff.; NCED: 113].

III. Hinukh

III.1. General.

The primary sources on Hinukh are the dictionaries [Khalilov & Isakov 2005; Kibrik & Kodzasov 1990] and the grammars [Forker 2013; Lomtadze 1963; Imnaishvili 1963], plus the grammar sketches [Isakov & Khalilov 2004; Bokarev 1959: 110-142].

Hinukh lexical data are systematically quoted in [Comrie & Khalilov 2010], but we prefer not to use this source due to its general unreliability.

III.2. Transliteration.

The following transliterational chart covers our principal sources:

[Isakov & Khalilov 2004] [Forker 2013] [Kibrik & Kodzasov 1990] [NCED] GLD
б b b b b
п p p p p
пI
f f
д d d d d
т t t t t
тI
ц c c c c
цI
з z z z z
с s s s s
ч č č č č
чI čʼ čʼ č̣ čʼ
ж ž ž ž ž
ш š š š š
лI ƛ L ƛ ƛ
кь ƛʼ ƛʼ ƛʼ
лъ ɬ ɬ λ ɬ
г g g g g
к k k k k
кI
гв
кв
кIв kʼʷ kʼ˳ ḳʷ kʼʷ
хъ q q q q
къ
хъв
къв qʼʷ qʼ˳ q̇ʷ qʼʷ
гъ x R ʁ ʁ
х ɣ X χ χ
гъв ʁʷ ʁʷ
хв ɣʷ χʷ χʷ
гI ʡ ʕ (ʡ) ʕ ʡ
хI ħ H ħ ħ
ъ ʔ ʔ ʔ ʔ
гь h h h h
м m m m m
н n n n n
р r r r r
л l l l l
в w w w w
й y j j y
CC CC CC CC
и i i i i
е, э e e e e
а a a a a
о o o o o
у u u u u
и ü ü ü ü
I ʡ I I ˤ

1. We treat geminated consonants as single long units (e.g., tʼː cː šː and so on) instead of bi-phonemic clusters (tʼtʼ cc šš and so on) as is done in other sources. In most cases, these geminates originate from contraction with a suffixal consonant (normally y).

2. Voiceless stops and affricates (t, č and so on) are slightly aspirated (tʰ, čʰ and so on) [Kibrik & Kodzasov 1990: 329; Forker 2013: 28].

3. The uvular stops (q etc.) are in fact affricates (qχ etc.) [Forker 2013: 28; Isakov & Khalilov 2004: 168].

4. Fricative f is restricted to recent Russian loanwords.

5. Following common practice, we do not note the initial glottal stop (ʔ), which is an automatic prothesis in the case of vocalic onset [Forker 2013: 30].

6. It is reported in [Forker 2013: 23 f.] that the vowels i ü u o possess lax and tense variants (lax ɪ ʏ ʊ ɔ vs. tense i ü u o), but the same source states that this opposition is not phonemic.

7. The vowel ü is restricted to the speech of the older generation; younger speakers replace it with i (or very occasionally with u) [Kibrik & Kodzasov 1990: 330; Forker 2013: 24].

8. The prosodic feature of pharyngealization is realized as epiglottalization (for the sake of convenience we transcribe it as pharyngealization ˤ); it is residually observed in several words [Kibrik & Kodzasov 1990: 330], see the discussion in [Forker 2013: 26 f.].

IV. Dido (Kidero, Sagada)

IV.1. General.

The Dido (Tsez, Cez) language consists of several dialects: Kidero, Asakh, Mokok, Shaytl, Shapikh and Sagada (Sahada). Of these, Sagada is the most distinct, so that Dido is sometimes said to consist of just two dialects: Dido proper (with the aforementioned sub-dialects) and Sagada.

Available lexicographic data are sufficient for the compilation of two lists for two main dialects: Kidero and Sagada.

The main sources on Kidero are the dictionaries [Khalilov 1999; Kibrik & Kodzasov 1990] and the grammatical descriptions [Alekseev & Radzhabov 2004; Imnaishvili 1963; Bokarev 1959: 175-221].

The main source on Sagada is the 110-item wordlist [Abdulaev 2014], compiled in accordance with the GLD semantic specifications. This list was recorded by Arsen Abdulaev in Makhachkala in February 2014 from one informant. Name: Khizri Makhmudov (Хизри Махмудов), male, born in Sagada village (Tsuntinsky district, Dagestan, Russia) in 1972, lives in Yurkovka village (Tarumovsky district, Dagestan), higher education, Sagada native speaker, also speaks Russian, Avar, Chamalal. Some Sagada forms and grammatical information have been taken from [Imnaishvili 1963; Khalilov 1999].

For other dialects, cf. the Asakh grammar sketch [Comrie et al. 1998] and the Mokok grammar sketch [Comrie 2007]. When dialect material is available, we quote it in the notes.

Dido lexical data (Mokok, Sagada) are systematically quoted in [Comrie & Khalilov 2010], but we prefer not to use this source due to its general unreliability.

IV.2. Transliteration.

The following transliterational chart covers our principal sources:

[Khalilov 1999] [Maddieson et al. 1996] [Kibrik & Kodzasov 1990] [NCED] GLD
б b b b b
п p p p p
пI
д d d d d
т t t t t
тI
ц ts c c c
цI tsʼ
з z z z z
с s s s s
ч č č č
чI tʃʼ čʼ č̣ čʼ
ж ʒ ž ž ž
ш ʃ š š š
лI L ƛ ƛ
кь tɬʼ ƛʼ ƛʼ
лъ ɬ ɬ λ ɬ
г g g g g
к k k k k
кI
хъ q q q q
къ
гъ ʁ R ʁ ʁ
х χ X χ χ
гI ʡ ʕ ʕ ʡ
хI ʜ H ħ ħ
ъ ʔ ʔ ʔ ʔ
гь h h h h
м m m m m
н n n n n
р r r r r
л l l l l
в w w w w
й j j j y
Cw
CC CC CC CC
и i i i i
е, э e e e e
а a a a a
аь ä ä ä
о o o o o
у u u u u
I ˤ I I ˤ

1. We treat geminated consonants as single long units (e.g., tʼː cː šː and so on) instead of bi-phonemic clusters (tʼtʼ cc šš and so on) as is done in other sources. In most cases, these geminates originate from contraction with a suffixal consonant (normally y).

2. Voiceless stops and affricates (t, č and so on) are slightly aspirated (tʰ, čʰ and so on) [Kibrik & Kodzasov 1990: 329; Forker 2013: 28].

3. In inherited words, ʔ and ʡ are restricted to the initial position, where ʡ is an allophone of ʔ in pharyngealized (see below) forms. Following common practice, we do not note such an automatic initial ʔ.

4. Pharyngealization ˤ is a prosodic feature which spreads over the whole phonetic word. If there are no uvular obstruents in a phonetic word, pharyngealization is transcribed for the first vowel. Otherwise, pharyngealization is noted after the first uvular obstruent (q qː χ ʁ).

5. Labialization ʷ is restricted to velar and uvular obstruents in the prevocalic position (kʷV, qʷV and so on).

6. The vowel ä is described in [Khalilov 1999; Kibrik & Kodzasov 1990; Imnaishvili 1963; Bokarev 1959], but specific forms with ä do not coincide between these sources (a wordform with ä in one source can correspond to a wordform with a or e in another). On the contrary, in [Maddieson et al. 1996; Alekseev & Radzhabov 2004], the phoneme ä is not identified at all.
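The notational convention for pharyngealization in note 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name mark_pharyngealization is hypothetical, the input is a form written without ˤ, the vowel list is simplified, and placing ˤ after any ejective, length or labialization mark that follows the target segment is an assumption of this sketch rather than a rule stated above. The sample strings are made up.

```python
UVULARS = "qχʁ"      # uvular obstruents named in note 4 (base symbols)
VOWELS = "ieaäou"    # simplified vowel list (assumption)
MODIFIERS = "ʼːʷ"    # assumption: marks that belong to the preceding segment


def mark_pharyngealization(form: str) -> str:
    """Insert ˤ after the first uvular obstruent, else after the first vowel."""
    # Uvulars take priority; vowels are only tried if no uvular is present.
    for targets in (UVULARS, VOWELS):
        for i, ch in enumerate(form):
            if ch in targets:
                j = i + 1
                # Skip marks that modify the segment just found.
                while j < len(form) and form[j] in MODIFIERS:
                    j += 1
                return form[:j] + "ˤ" + form[j:]
    return form  # nothing to attach the mark to


if __name__ == "__main__":
    # Made-up strings, used only to exercise the convention.
    for sample in ("mada", "aqo", "qʼala"):
        print(sample, "->", mark_pharyngealization(sample))
```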

V. Khwarshi (proper, Inkhokwari)

V.1. General.

The Khwarshi (Khvarshi, Xvarshi) language consists of five dialects: Khwarshi proper, Inkhokwari, Kwantlada, Santlada, Khwayni. Of these, Kwantlada, Santlada and Khwayni are very close to each other, and all three are close to Inkhokwari as opposed to the more distinct Khwarshi proper (see [Khalilova 2009: 4] for details). Khwarshi proper and Inkhokwari (with Kwantlada, Santlada, Khwayni) are frequently treated as two separate languages, referred to simply as Khwarshi and Inkhokwari respectively.

The available lexicographical data are sufficient for compiling three lists: Khwarshi proper, Inkhokwari Khwarshi and Kwantlada Khwarshi. Actually, no lexicostatistical mismatches between the Inkhokwari and Kwantlada 110-item wordlists have been revealed, so we prefer to allocate Kwantlada data within the notes section on Inkhokwari rather than offer a separate wordlist for Kwantlada.

The main source for Khwarshi proper is the 110-item wordlist [Karimova 2014], compiled in accordance with the GLD semantic specifications. This list was recorded by Raisat Karimova in Oktyabrskoe village, Khasavyurtovsky district, Dagestan, Russia, in February 2014 from one informant. Name: Zaynap Magomedova (Зайнап Магомедова), female, born 1962, lives in Oktyabrskoe (before marriage, lived in Mutsalaul village, Khasavyurtovsky district), higher education, Khwarshi proper native speaker, also speaks Kwantlada Khwarshi (her husband is a Kwantlada native speaker), Avar and Russian. Some Khwarshi proper forms and grammatical information have been taken from the grammar sketches [Sharafutdinova & Levina 1961; Imnaishvili 1963]. An additional source is [NCED], whose authors use lexical data collected by the Tsezic enthusiast Ramazan Radzhabov (incorrectly named as Radzhibov in [NCED: 6] and Nadzhipov in [NCED: 11]). Khwarshi proper lexical data are systematically quoted in [Comrie & Khalilov 2010], but we prefer not to use this source due to its general unreliability.

The main sources for Inkhokwari Khwarshi are the noun glossary [Kibrik & Kodzasov 1990] plus the 110-item wordlist [Karimova 2014], compiled in accordance with the GLD semantic specifications. This list was recorded by Raisat Karimova in Oktyabrskoe village, Khasavyurtovsky district, Dagestan, Russia, February 2014 from one informant. Name: Dzhamilya Mirzoeva (Джамиля Мирзоева), female, born 1973, lives in Oktyabrskoe, higher education, works as a school teacher, Inkhokwari native speaker, also speaks Avar and Russian. Some Inkhokwari forms and grammatical information have been taken from [Imnaishvili 1963; Bokarev 1959: 143-174]. Inkhokwari Khwarshi lexical data are systematically quoted in [Comrie & Khalilov 2010], but we prefer not to use this source due to its general unreliability.

The main source for Kwantlada Khwarshi is the 110-item wordlist [Karimova 2014], compiled in accordance with the GLD semantic specifications. This list was recorded by Raisat Karimova in Oktyabrskoe village, Khasavyurtovsky district, Dagestan, Russia, February 2014 from one informant. Name: Khalizha Magomedova (Халижа Магомедова), female, born 1970, lives in Oktyabrskoe, higher education, works as a school teacher, Kwantlada Khwarshi native speaker, also speaks Russian. Many Kwantlada forms and grammatical information have been taken from the grammar description [Khalilova 2009].

V.2. Transliteration.

The following transliterational chart covers our principal sources:

[Sharafutdinova & Levina 1961; Karimova 2014] [Khalilova 2009] [Kibrik & Kodzasov 1990] [NCED] GLD
б b b b b
п p p p p
пI
ф f f
д d d d d
т t t t t
тI
ц c c c c
цI
з z z z z
с s s s s
ч č č č č
чI čʼ čʼ č̣ čʼ
ж ž ž ž ž
ш š š š š
лI λ L ƛ ƛ
кь λʼ ƛʼ ƛʼ
лъ ɬ ɬ λ ɬ
г g g g g
к k k k k
кI
x
хъ q q q q
къ
гъ ɣ R ʁ ʁ
х x X χ χ
гI ʕ ʕ ʕ ʕ
хI ħ H ħ ħ
ъ ʔ ʔ ʔ ʔ
гь h h h h
м m m m m
н n n n n
р r r r r
л l l l l
лʼ l
в w w w w
й j j j y
C̄, CC CC CC CC
и i i i i
е, э e e e e
а a a a a
аь ä ä ä ä
о o o o o
у u u u u
ы ɨ ɨ ɨ ɨ
Ṽ, Vн Vn
I ˤ I, Ṿ I ˤ

1. We treat geminated consonants as single long units (e.g., tʼː cː šː and so on) instead of bi-phonemic clusters (tʼtʼ cc šš and so on) as is done in other sources. In most cases, these geminates originate from contraction with a suffixal consonant (normally y).

2. Voiceless stops and affricates (t, č and so on) are slightly aspirated (tʰ, čʰ and so on) [Kibrik & Kodzasov 1990: 327].

3. Some consonant phonemes are normally or totally restricted to Tindi, Avar and Russian loanwords: f, w, x, ʕ, ħ, labialized sibilants.

4. Following common practice, we do not note the initial glottal stop (ʔ), which is an automatic prothesis in the case of vocalic onset [Forker 2013: 30].

5. Pharyngealization ˤ (which is characteristic of the dialects other than Khwarshi proper) is a prosodic feature which spreads over the entire phonetic word. If there are no uvular obstruents in the phonetic word, pharyngealization is transcribed for the first vowel. Otherwise, pharyngealization is noted after the first uvular obstruent (q qː χ ʁ).

6. Palatalized lʸ is characteristic of the dialects other than Khwarshi proper. In most cases, it is an automatic variant of l. The shift l > lʸ occurs after e, after i or in pharyngealized forms [Kibrik & Kodzasov 1990: 327; Khalilova 2009: 19 f.]. Nevertheless, there is a small number of instances of lʸ in other contexts [Khalilova 2009: 20], which makes lʸ phonemic.

7. Kodzasov describes the Inkhokwari prosodic system as tonal [Kibrik & Kodzasov 1990: 327] with several tones distinguished. These tones are not mentioned in other Khwarshi sources, and, strictly speaking, it is not entirely clear whether the Inkhokwari prosodic oppositions are indeed tonal or not. We do not quote the tonal transcription.

8. As follows from the field records in [Karimova 2014], modern speakers tend to drop pharyngealization and nasalization of vowels, as well as labialization of consonants.
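The automatic part of the shift l > lʸ described in note 6 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name palatalize_l is hypothetical, "after e, after i" is read as "immediately after e or i", and a pharyngealized form is taken to be any form that contains the mark ˤ, in which case every l is shifted. The non-automatic instances of lʸ mentioned in the note are outside the scope of the sketch, and the sample strings are made up.

```python
import re


def palatalize_l(form: str) -> str:
    """Apply the automatic shift l > lʸ (after e or i, or in pharyngealized forms)."""
    if "ˤ" in form:
        # Assumption: in a pharyngealized form every plain l is affected.
        return re.sub(r"l(?!ʸ)", "lʸ", form)
    # Otherwise only an l that immediately follows e or i is affected.
    return re.sub(r"(?<=[ei])l(?!ʸ)", "lʸ", form)


if __name__ == "__main__":
    # Made-up strings, used only to exercise the rule.
    for sample in ("hila", "hala", "haˤla"):
        print(sample, "->", palatalize_l(sample))
```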

VI. Proto-Tsezic

VI.1. General.

The only systematic published reconstruction of the Proto-Tsezic phonological system and etymological corpus belongs to Sergei Nikolaev (Nikolayev), although the reconstruction naturally acknowledges its debt to research by earlier Caucasologists, of whom E. A. Bokarev deserves primary mention. S. Nikolaev’s reconstruction was included in [NCED] and published electronically as Tsezic Etymological Database [TsezEDb] on the StarLing database server. It must be noted that [TsezEDb] only includes those Proto-Tsezic morphemes for which external North Caucasian etymology has been proposed by the authors of [NCED], and the Swadesh words of individual languages (even if these lack external North Caucasian comparanda). Some further corrections and additions to Proto-Tsezic vowel reconstruction were proposed by Ya. Testelets.

In reconstructing the Swadesh wordlist for Proto-Tsezic, we generally follow [NCED] and [TsezEDb], although in some cases we revise and, occasionally, even reject Nikolaev's specific etymologies (this mostly has to do with new Tsezic data that have been published since the mid-1990s). The optional letters A or B after a Proto-Tsezic form denote a specific prosodic class [NCED: 75 f., 113 f.]. For unclear reasons, Nikolaev tends not to project pharyngealization onto Proto-Tsezic forms when this prosodic feature is attested in ancestral forms in daughter languages. We reconstruct Proto-Tsezic pharyngealization in such cases. The threefold opposition *ɬ / *ʫ / *l, postulated in [NCED: 110], is typologically rather problematic, but we provisionally leave it as is.

The main phylogenetic methods (Neighbor joining, UPGMA, Bayesian MCMC, Maximum parsimony) propose a twofold division into East Tsezic and West Tsezic, but differ in the topology of the West Tsezic cluster: some of the methods suggest a Hinukh-Dido unity opposed to Khwarshi (MP, UPGMA), others suggest a Dido-Khwarshi unity opposed to Hinukh (MCMC, NJ).

Below we examine the reverse lexicostatistical distances for two East Tsezic lects (Hunzib proper, Bezhta proper) and three West Tsezic lects (Hinukh, Kidero Dido, Khwarshi proper); a higher percentage of shared basic vocabulary means greater closeness.

                      Hunzib proper (ETs)   Bezhta proper (ETs)   Kidero Dido (WTs)   Khwarshi proper (WTs)
Hinukh (WTs)          62.9%                 63.6%                 76.4%               69.6%
Hunzib proper (ETs)   -                     86.8%                 55.3%               55.4%
Bezhta proper (ETs)   -                     -                     54.3%               55.9%
Kidero Dido (WTs)     -                     -                     -                   76.2%

If we exclude Hinukh, the lexicostatistical distances between the four remaining lects fulfil the condition of additivity: the two East Tsezic lects are close to each other (86.8%), the two West Tsezic lects are close to each other (76.2%), whereas any East Tsezic lect is equally remote from any West Tsezic lect (ca. 55%).

The configuration gets abnormal, however, when Hinukh is introduced.

First, distances between three West Tsezic lects do not fulfil the condition of additivity: Kidero Dido is equally close to Khwarshi and Hinukh (76.2% ~ 76.4%), whereas Khwarshi and Hinukh are remote from each other (69.6%). It means that there should be a number of parasitic, i.e., secondary matches either between Kidero Dido & Khwarshi or between Kidero Dido & Hinukh. Geographical distribution and a high number of specific cultural words, shared by Kidero Dido and Hinukh (Ya. Testelets, p.c.), suggest that this pair is expected to have secondary contacts. Since Sagada Dido (which is not adjacent to the Hinukh territory) demonstrates the same closeness to Hinukh as Kidero Dido does (76.2% ~ 76.4%), it is more likely that the normal direction of the influence is Kidero Dido > Hinukh rather than vice versa.

Second, comparison with East Tsezic lects also demonstrates irregular ratios. Four sets of three languages can be analyzed.

1) Hunzib proper (ETs) / Bezhta proper (ETs) / Hinukh (WTs). The configuration is normal: two East Tsezic lects are close to each other (86.8%) and equally remote from the West Tsezic lect (62.9% ~ 63.6%).

2) Hunzib proper (ETs) / Bezhta proper (ETs) / Kidero Dido (WTs). The configuration is normal: two East Tsezic lects are close to each other (86.8%) and equally remote from the West Tsezic lect (54.3% ~ 55.3%).

3) Hinukh (WTs) / Kidero Dido (WTs) / Hunzib proper (ETs). The configuration is not quite normal: two West Tsezic lects are indeed close to each other (76.4%), but not equally remote from the East Tsezic lect: Hinukh / Hunzib = 62.9%, whereas Kidero Dido / Hunzib = only 55.3% (the difference is 7.6).

4) Hinukh (WTs) / Kidero Dido (WTs) / Bezhta proper (ETs). The configuration is even more abnormal: two West Tsezic lects are indeed close to each other (76.4%), but not equally remote from the East Tsezic lect: Hinukh / Bezhta = 63.6%, whereas Kidero Dido / Bezhta = only 54.3% (the difference is 9.3).

As follows from this analysis, the lexicostatistical distances between two West Tsezic lects and one East Tsezic lect do not satisfy the condition of additivity. Hinukh demonstrates abnormal closeness to East Tsezic lects, both to Bezhta and Hunzib. Such closeness should be treated as secondary, i.e., we assume a number of secondary lexical matches between Proto-Hinukh and Proto-East Tsezic. This can be explained as a result of serious influence between Proto-Hinukh and Proto-East Tsezic, although the default direction of influence, Proto-East Tsezic > Proto-Hinukh or vice versa, cannot be established by means of such a formal analysis.
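The triple comparisons above can be re-checked mechanically. The following Python sketch is an illustration only, not the procedure used by the compilers: it copies the percentages from the table, and for every pair of lects from one branch reports how unevenly an outsider from the other branch relates to the two. The names SHARED, BRANCH, gap and branch_triples are hypothetical, and no significance threshold is assumed.

```python
from itertools import combinations

# Shared basic vocabulary (%), copied from the table above.
SHARED = {
    ("Hinukh", "Hunzib"): 62.9,
    ("Hinukh", "Bezhta"): 63.6,
    ("Hinukh", "Kidero Dido"): 76.4,
    ("Hinukh", "Khwarshi"): 69.6,
    ("Hunzib", "Bezhta"): 86.8,
    ("Hunzib", "Kidero Dido"): 55.3,
    ("Hunzib", "Khwarshi"): 55.4,
    ("Bezhta", "Kidero Dido"): 54.3,
    ("Bezhta", "Khwarshi"): 55.9,
    ("Kidero Dido", "Khwarshi"): 76.2,
}

# Branch membership as given in the table (ETs = East, WTs = West Tsezic).
BRANCH = {
    "Hunzib": "ETs",
    "Bezhta": "ETs",
    "Hinukh": "WTs",
    "Kidero Dido": "WTs",
    "Khwarshi": "WTs",
}


def shared(a: str, b: str) -> float:
    """Look up the shared percentage for an unordered pair of lects."""
    return SHARED.get((a, b), SHARED.get((b, a)))


def gap(a: str, b: str, outsider: str) -> float:
    """Difference between the outsider's shared percentages with a and with b."""
    return round(abs(shared(a, outsider) - shared(b, outsider)), 1)


def branch_triples():
    """Yield (a, b, outsider, gap) for same-branch pairs and other-branch outsiders."""
    for a, b in combinations(BRANCH, 2):
        if BRANCH[a] == BRANCH[b]:
            for c in BRANCH:
                if BRANCH[c] != BRANCH[a]:
                    yield a, b, c, gap(a, b, c)


if __name__ == "__main__":
    for a, b, c, g in branch_triples():
        print(f"{a} / {b} vs. {c}: gap {g}")
```

A gap near zero corresponds to the "normal" configurations 1) and 2); the gaps for Hinukh / Kidero Dido against Hunzib and Bezhta reproduce the differences of 7.6 and 9.3 quoted in 3) and 4).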

Thus, we could suppose two stages in the history of Hinukh. Initially, Hinukh entered into close contact with Proto-East Tsezic and subsequently Bezhta (the direction of influence is not entirely clear). Later, Hinukh was influenced by the neighboring Dido (especially Kidero Dido). Cf. a similar statement by Forker: “there has been and there still is extensive contact between Hinuq speakers and speakers of two other Tsezic languages, Bezhta and Tsez” [Forker 2013: 12]. Forker also attributes Hinukh-Dido contacts to the present time: “Many Hinuq men marry Tsez women, who then move to the village of Hinuq. These women often do not fully acquire the Hinuq language and sometimes simply continue to speak Tsez, at least at home” [Forker 2013: 16].

Database compiled and annotated by:
Hunzib (proper): A. Kassian, October 2013 / revised November 2013 (minor corrections) / revised July 2014 (minor corrections). We are thankful to Yakov Testelets (Moscow) for a number of valuable remarks on Hunzib data.
Bezhta (Bezhta proper, Khoshar-Khota, Tlyadal): A. Kassian, November 2013 / revised January 2014 (minor corrections) / revised July 2014 (minor corrections) / revised April 2015 (several lexical additions and corrections). We are thankful to Yakov Testelets (Moscow) & Madzhid Khalilov (Makhachkala) for a number of valuable remarks on Bezhta data.
Hinukh: A. Kassian, December 2013 / revised July 2014 (minor corrections).
Dido (Kidero): A. Kassian, January 2014 / revised July 2014 (minor corrections).
Dido (Sagada): A. Kassian, April 2014 (using field records by Arsen Abdulaev) / revised July 2014 (minor corrections).
Khwarshi (Khwarshi proper, Inkhokwari): A. Kassian, April 2014 (using field records by Raisat Karimova) / revised July 2014 (minor corrections).
Proto-Tsezic: A. Kassian, July 2014 / revised April 2015 (minor corrections).